AITopics | Pattern Recognition

Collaborating Authors

Pattern Recognition

"... the research area that studies the operation and design of systems that recognize patterns in data." It includes statistical methods like discriminant analysis, feature extraction, error estimation, cluster analysis.
– Pattern Recognition Laboratory at Delft University of Technology

News Overviews Instructional Materials AI-Alerts Classics

Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification

Neural Information Processing SystemsMay-28-2025, 15:34:38 GMT

We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, meta-learning, among other tasks. It includes 40 open datasets, each having at least 20 classes with 40 examples per class, with verified licences. They stem from diverse domains, such as ecology (fauna and flora), manufacturing (textures, vehicles), human actions, and optical character recognition, featuring various image scales (microscopic, human scales, remote sensing). All datasets are preprocessed, annotated, and formatted uniformly, and come in 3 versions (Micro Mini Extended) to match users' computational resources.

artificial intelligence, machine learning, pattern recognition, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Asia (0.93)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.62)
(3 more...)

Add feedback

Connectionist Temporal Classification with Maximum Entropy Regularization

Hu Liu, Sheng Jin, Changshui Zhang

Neural Information Processing SystemsMay-26-2025, 10:41:24 GMT

Connectionist Temporal Classification (CTC) is an objective function for end-toend sequence learning, which adopts dynamic programming algorithms to directly learn the mapping between sequences. CTC has shown promising results in many sequence learning applications including speech recognition and scene text recognition. However, CTC tends to produce highly peaky and overconfident distributions, which is a symptom of overfitting. To remedy this, we propose a regularization method based on maximum conditional entropy which penalizes peaky distributions and encourages exploration. We also introduce an entropybased pruning method to dramatically reduce the number of CTC feasible paths by ruling out unreasonable alignments. Experiments on scene text recognition show that our proposed methods consistently improve over the CTC baseline without the need to adjust training settings.

artificial intelligence, machine learning, pattern recognition, (21 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.42)

Add feedback

Bilevel Distance Metric Learning for Robust Image Recognition

Jie Xu, Lei Luo, Cheng Deng, Heng Huang

Neural Information Processing SystemsMay-26-2025, 07:52:36 GMT

Metric learning, aiming to learn a discriminative Mahalanobis distance matrix M that can effectively reflect the similarity between data samples, has been widely studied in various image recognition problems. Most of the existing metric learning methods input the features extracted directly from the original data in the preprocess phase. What's worse, these features usually take no consideration of the local geometrical structure of the data and the noise that exists in the data, thus they may not be optimal for the subsequent metric learning task. In this paper, we integrate both feature extraction and metric learning into one joint optimization framework and propose a new bilevel distance metric learning model. Specifically, the lower level characterizes the intrinsic data structure using graph regularized sparse coefficients, while the upper level forces the data samples from the same class to be close to each other and pushes those from different classes far away. In addition, leveraging the KKT conditions and the alternating direction method (ADM), we derive an efficient algorithm to solve the proposed new model. Extensive experiments on various occluded datasets demonstrate the effectiveness and robustness of our method.

artificial intelligence, machine learning, pattern recognition, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

A Simple Cache Model for Image Recognition

Emin Orhan

Neural Information Processing SystemsMay-26-2025, 07:16:53 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, pattern recognition, (19 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

LinkNet: Relational Embedding for Scene Graph

Sanghyun Woo, Dahun Kim, Donghyeon Cho, In So Kweon

Neural Information Processing SystemsMay-26-2025, 06:37:33 GMT

Neural Information Processing Systems http://nips.cc/

machine learning, natural language, pattern recognition, (14 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

f7b424d242cc6bb7708cff241367334d-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-25-2025, 17:02:21 GMT

machine learning, pattern recognition, recognition, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.31)

Add feedback

NAVI: Category-Agnostic Image Collections with High-Quality 3D Shape and Pose Annotations

Neural Information Processing SystemsMay-25-2025, 16:08:14 GMT

Recent advances in neural reconstruction enable high-quality 3D object reconstruction from casually captured image collections. Current techniques mostly analyze their progress on relatively simple image collections where Structurefrom-Motion (SfM) techniques can provide ground-truth (GT) camera poses. We note that SfM techniques tend to fail on in-the-wild image collections such as image search results with varying backgrounds and illuminations. To enable systematic research progress on 3D reconstruction from casual image captures, we propose'NAVI': a new dataset of category-agnostic image collections of objects with high-quality 3D scans along with per-image 2D-3D alignments providing near-perfect GT camera parameters. These 2D-3D alignments allow us to extract accurate derivative annotations such as dense pixel correspondences, depth and segmentation maps. We demonstrate the use of NAVI image collections on different problem settings and show that NAVI enables more thorough evaluations that were not possible with existing datasets. We believe NAVI is beneficial for systematic research progress on 3D reconstruction and correspondence estimation.

artificial intelligence, machine learning, pattern recognition, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Industry: Media > Photography (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Communications (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.35)

Add feedback

e24570da4fa1c005b189104250993aee-Paper-Conference.pdf

Neural Information Processing SystemsMay-25-2025, 14:32:33 GMT

machine learning, natural language, pattern recognition, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Law > Civil Rights & Constitutional Law (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
(2 more...)

Add feedback

EV-Eye: Rethinking High-frequency Eye Tracking through the Lenses of Event Cameras,Yurun Yang

Neural Information Processing SystemsMay-25-2025, 11:07:15 GMT

In this paper, we present EV-Eye, a first-of-its-kind large-scale multimodal eye tracking dataset aimed at inspiring research on high-frequency eye/gaze tracking. EV-Eye utilizes the emerging bio-inspired event camera to capture independent pixel-level intensity changes induced by eye movements, achieving submicrosecond latency. Our dataset was curated over two weeks and collected from 48 participants encompassing diverse genders and age groups. It comprises over 1.5 million near-eye grayscale images and 2.7 billion event samples generated by two DAVIS346 event cameras. Additionally, the dataset contains 675 thousand scene images and 2.7 million gaze references captured by a Tobii Pro Glasses 3 eye tracker for cross-modality validation. Compared with existing event-based high-frequency eye tracking datasets, our dataset is significantly larger in size, and the gaze references involve more natural and diverse eye movement patterns, i.e., fixation, saccade, and smooth pursuit. Alongside the event data, we also present a hybrid eye tracking method as a benchmark, which leverages both the near-eye grayscale images and event data for robust and high-frequency eye tracking. We show that our method achieves higher accuracy for both pupil and gaze estimation tasks compared to the existing solution.

artificial intelligence, machine learning, pattern recognition, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: